Abstract

We investigate the condition under which the Eulerian trail of a digraph
is unique, and design a finite automaton to examine it. The algorithm is
effective: if the condition is violated, the violation is noticed
immediately, without the need to trace through the whole trail.

1 Introduction 

The problem of finding an Eulerian trail in a traversable directed pseudograph is
well solved, and a counting formula is given in [3, 2]. In some applications,
however, such as reconstructing a string from its composition of short substrings,
discussed in various contexts [5, 2, 4, 1], what matters is uniqueness rather than
the exact number of trails, so the tedious calculation seems unnecessary.
Considering a trail as a symbolic sequence over the set of vertices, Kontorovich
showed that the unique Eulerian trails form a regular language [4]. We present a
different proof by characterizing its complement, which leads to an effective
implementation of a deterministic finite automaton (DFA) that accepts it, and we
gain insight into its structure through its minimal forbidden words.



2 Results 

In the following, we will freely switch between the concepts of graph theory
and formal language theory. When the latter viewpoint is emphasized, the set of
vertices $V$ is denoted $\Sigma$.



2.1 The language 

Pevzner [6] proved that any two Eulerian trails of a digraph $G$ can be
transformed into each other by a series of operations called rotations and
transpositions. Roughly speaking, rotations correspond to the choice of initial
vertex if the trail is closed, and a transposition swaps the order of two paths
between a pair of vertices in the trail. Without loss of generality, we always
suppose that the initial vertex is fixed. Thus an Eulerian trail is not unique
only if it has a transposition

\[ T : uaxbzaybv \longrightarrow uaybzaxbv, \tag{1} \]
where $a, b \in \Sigma$ and $u, v, x, y, z \in \Sigma^*$. If $a = b$, it degenerates to the form

\[ T : uaxayav \longrightarrow uayaxav. \tag{2} \]

On the other hand, $x \ne y$ alone does not ensure that the transposition
yields a different trail. For example, let $x = ba$ and $u = v = y = z = \varepsilon$;
then the trail in (1) becomes $t = ababab$, which is invariant under the
operation and is in fact unique. To eliminate this case, we further require that
the two $a$'s before $x$ and $y$ on the left-hand side of (1) or (2) are followed
by distinct vertices. We then call the corresponding transposition proper.
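
The effect of transposition (1) can be checked on small strings. The following is a minimal sketch, assuming vertices are single characters and trails are Python strings; the helper name `transpose` is hypothetical and not part of the paper.

```python
def transpose(u, a, x, b, z, y, v):
    """Apply transposition (1): u a x b z a y b v -> u a y b z a x b v."""
    before = u + a + x + b + z + a + y + b + v
    after = u + a + y + b + z + a + x + b + v
    return before, after


# The degenerate example from the text: x = "ba", all other words empty.
before, after = transpose("", "a", "ba", "b", "", "", "")
print(before, after, before == after)

# A proper transposition: the two a's are followed by distinct vertices c and d.
before, after = transpose("", "a", "c", "b", "", "d", "")
print(before, after, before == after)
```

In the second call the two strings differ but traverse the same multiset of edges, so the trail is not unique.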

Lemma 1. Every non-identical transposition is equivalent to a proper
transposition.

Proof. For any transposition $T(t) \ne t$, we can write it in the form of (1) or (2).
If both $a$'s are followed by $a'$, then let $u' = ua$.

1. If $a \ne b$, then $t = u'xbzaybv$, where $x \ne y$.

(a) If $x \ne \varepsilon$ and $y \ne \varepsilon$, then we can write $x = a'x'$ and $y = a'y'$, and let
$z' = za$. Otherwise, $a' = b$.

(b) If $x = \varepsilon$, then we can write $y = a'y'$, and let $x' = za$.

(c) If $y = \varepsilon$, then we can write $x = a'x'$, and let $y' = za$.

2. If $a = b$, then $t = u'xayav$, where $x \ne y$.

(a) If $x \ne \varepsilon$ and $y \ne \varepsilon$, then we can write $x = a'x'$ and $y = a'y'$.
Otherwise, $a' = a$.

(b) If $x = \varepsilon$, then we can write $y = a'y'$, and let $x' = \varepsilon$.

(c) If $y = \varepsilon$, then we can write $x = a'x'$, and let $y' = \varepsilon$.

Therefore, $t$ has a transposition $T' : u'a'x'bz'a'y'bv \longrightarrow u'a'y'bz'a'x'bv$ in case
(1a) or $T' : u'a'x'a'y'a'v \longrightarrow u'a'y'a'x'a'v$ in the other cases. Note that $T'(t) = T(t)$.
Substituting $T'$ for $T$ and repeating the above process, we will eventually get an
equivalent proper transposition. □

We conclude that an Eulerian trail t is unique if and only if it does not have 
a proper transposition. Let L be the language composed of unique Eulerian 
trails and L' be the language composed of those with proper transpositions, 
then they are complementary to each other. 
By the definition of proper transposition, all sequences in $L'$ have a unified
form

\[ t = uawaybv, \tag{3} \]

where $u, v, w, y \in V^*$, $b$ appears in $aw$, and the vertices next to the two $a$'s are
distinct. It results in a right-linear grammar $G$ that generates $L'$:

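\begin{align*}
S &\to dS \mid aA_a, \\
A_a &\to cB_{acc} \mid aC_{aa}, \\
B_{acb} &\to dB_{acb} \mid dB_{acd} \mid aC_{cb}, \\
C_{cb} &\to dD_b \ (d \ne c) \mid bR \ (b \ne c), \\
D_b &\to dD_b \mid bR, \\
R &\to dR \mid \varepsilon,
\end{align*}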

where $a, b, c, d$ run over $\Sigma$. Therefore, $L'$ is a regular language, and $L = \overline{L'}$ is
also regular.
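
Membership in $L'$ can also be tested directly from form (3), without the grammar. The sketch below is a naive cubic-time check, assuming single-character vertices; the function name `in_L_prime` is hypothetical.

```python
def in_L_prime(t):
    """True if t = u a w a y b v, with b in aw and distinct followers of the two a's."""
    n = len(t)
    for i in range(n):                  # position of the first a
        for j in range(i + 1, n - 1):   # position of the second a (it needs a follower)
            if t[j] != t[i] or t[i + 1] == t[j + 1]:
                continue                # not the same vertex, or same follower
            circuit = set(t[i:j])       # the vertices of aw
            if any(t[k] in circuit for k in range(j + 1, n)):
                return True             # some later vertex b lies on the circuit
    return False


for trail in ["ababab", "abac", "aaba", "acada", "abcadb"]:
    print(trail, "non-unique" if in_L_prime(trail) else "unique")
```

This is only meant to make the definition concrete; it rescans the string for every pair of positions, unlike the automaton of Section 2.2.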

2.2 The finite automaton 

Technically we can construct a finite automaton that accepts $L$ from $G(L')$, but
it is more convenient to design it directly, as follows.

Input alphabet

$\Sigma = V$.

States

$Q = P \times N \times C$,

where

• $P = \Sigma \cup \{\alpha_0\}$, where $\alpha_0 \notin \Sigma$ denotes the beginning of the sequence, records
the last input vertex,

• $N = (\Sigma \cup \{\varepsilon\})^{m+1}$, where $m = |V|$, records the latest successor of every
vertex including $\alpha_0$,

• $C = \{\mathrm{WHITE}, \mathrm{BLACK}\}^m$ is the "color" of every vertex. A vertex is colored
black if it is in a circuit $awa$, where the vertex following the tail $a$ differs
from that of the head $a$.

Initial state

$q_0 = (\alpha_0, \varepsilon^{m+1}, \mathrm{WHITE}^m)$.

Final states

$F = \{(p, n, c) \in Q \mid c \ne \mathrm{BLACK}^m\}$.
Transition function

procedure $\delta(q, a)$
    if $n_p \ne \varepsilon$ and $n_p \ne a$ then
        $b \leftarrow p$
        repeat
            $c_b \leftarrow \mathrm{BLACK}$
            $b \leftarrow n_b$
        until $b = p$
    end if
    if $c_a = \mathrm{BLACK}$ then
        $c \leftarrow \mathrm{BLACK}^m$
    end if
    $n_p \leftarrow a$
    $p \leftarrow a$
end procedure
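
The transition procedure can be simulated directly. The following is a minimal sketch rather than a tuned implementation; it assumes vertices are hashable Python values, stores the successor table and the colors in a dictionary and a set instead of fixed-length tuples, and stops at the first rejection because the all-black color vector is absorbing. The names `accepts` and `START` are hypothetical.

```python
def accepts(trail):
    """Simulate the automaton on `trail`; True means the trail is accepted."""
    START = object()   # plays the role of alpha_0, the beginning of the sequence
    last = START       # p: the last input vertex
    succ = {}          # n: latest successor of each vertex (missing key = epsilon)
    black = set()      # c: the vertices currently coloured BLACK
    for a in trail:
        if last in succ and succ[last] != a:
            # The circuit from `last` back to itself now has two distinct
            # followers, so colour every vertex on the latest-successor chain.
            b = last
            while True:
                black.add(b)
                b = succ[b]
                if b == last:
                    break
        if a in black:
            return False   # c <- BLACK^m, which no later input can undo
        succ[last] = a
        last = a
    return True


for trail in ["ababab", "abcba", "abaa", "acada"]:
    print(trail, accepts(trail))
```

Each input symbol costs at most one pass over the current chain of latest successors, and a rejection is reported as soon as a black vertex is read.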

Now we prove that the DFA $M = (Q, \Sigma, \delta, q_0, F)$ accepts $L$.

Proof. First we show $L(M) \subseteq L$ by proving its contrapositive. If $t \notin L$, then it
has the form of (3). The design of $M$ assures that $c$ becomes $\mathrm{BLACK}^m$ after $b$ is
input and remains so, thus $M$ does not accept $t$.

Then we prove $L \subseteq L(M)$ by induction on the length of the input sequence $t$.

Basis: For $|t| = 0$, $t = \varepsilon \in L$. Since $q_0 \in F$, $t \in L(M)$.

Induction: For $|t| > 0$, if $t = sa \in L$, then $s \in L$, and by the inductive
hypothesis $s \in L(M)$. We prove $t \in L(M)$ by contradiction. Assume to the
contrary that $t \notin L(M)$, then there are two cases:

1. If $c_a = \mathrm{BLACK}$ just after $s$ is input, then $s$ must have the form $ubwby$,
where $a$ appears in $bw$ and the vertices following the two $b$'s are distinct.
Thus $sa \in L'$, which contradicts $t \in L$.

2. If $c_a = \mathrm{WHITE}$ just after $s$ is input, then $s$ must have the form $upwp$,
where $a$ appears in $pw$ and the vertex following the first $p$ is not $a$. Again
$sa \in L'$, which contradicts $t \in L$.

We conclude that $L(M) = L$. □



2.3 Minimal forbidden words 

Since $L$ is a factorial language, i.e. for any $t \in L$, all factors of $t$ also belong to
$L$, it can be determined by its minimal forbidden words (MFW) [7]. A string $r$
is a minimal forbidden word of $L$ if $r \notin L$ while all the proper factors of $r$ belong to $L$.
We categorize $\mathrm{MFW}(L)$ into sequences in the following two forms, which compose
a language $L''$:

\[ r = axbzayb, \tag{4} \]
\[ r = axaya, \tag{5} \]
where

1. $x \ne \varepsilon$ or $y \ne \varepsilon$,

2. $x, y, z \in L$,

3. $x, y, z$ do not contain $a$, $b$, and no two of $x, y, z$ contain common
vertices.

Theorem 2. $L'' = \mathrm{MFW}(L)$.

Proof. By definition all words in $L''$ are minimal forbidden words. Then we
prove that $L''$ is complete, i.e. $L' \subseteq \Sigma^* L'' \Sigma^*$. For any $t \in L'$, it must have the
form of (3), then $r = awayb \notin L$ satisfies condition 1. If it violates
condition 2, e.g. $x \notin L$, then let $t = x$. Repeat the above process until
condition 2 holds. Then if $y$ contains a vertex $c$ which appears in $aw$, $t$ must
have a prefix $away_1c \notin L$. Therefore, $t$ has a word $r$ in the form (4) or (5) where
$y$ does not contain $a$, $b$ or a common vertex with $x$, $z$. Since reversing every edge's
direction in a graph does not change the number of its Eulerian trails, $L$ is
closed under reversal. So we can also require that $x$ does not contain $a$, $b$ or a
common vertex with $z$. □

We can determine $L''$ by recursion on $|\Sigma|$. For the simplest non-trivial case,
say $\Sigma = \{0, 1\}$, $L''$ can be represented by the regular expression
$001^+0 + 01^+00 + 110^+1 + 10^+11$.
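
The binary case can be checked by exhaustive search. The sketch below is a brute-force illustration under stated assumptions: `is_unique` counts the distinct vertex sequences that start at the first vertex and use the same multiset of edges, `minimal_forbidden_words` relies on $L$ being factorial (so only the longest proper prefix and suffix need testing), and `PATTERN` is our reading of the regular expression above. All three names are hypothetical.

```python
import re
from collections import Counter
from itertools import product


def is_unique(t):
    """Brute force: True if t is the only Eulerian trail from t[0] over its own edges."""
    edges = Counter(zip(t, t[1:]))

    def count(v, left):
        if left == 0:
            return 1
        total = 0
        for (x, y) in list(edges):
            if x == v and edges[(x, y)] > 0:
                edges[(x, y)] -= 1          # use one copy of the edge x -> y
                total += count(y, left - 1)
                edges[(x, y)] += 1          # backtrack
        return total

    return len(t) < 2 or count(t[0], len(t) - 1) == 1


def minimal_forbidden_words(alphabet, max_len):
    """Words outside L whose longest proper prefix and suffix both lie in L."""
    words = []
    for n in range(1, max_len + 1):
        for s in map("".join, product(alphabet, repeat=n)):
            if not is_unique(s) and is_unique(s[:-1]) and is_unique(s[1:]):
                words.append(s)
    return words


# Assumed reading of the regular expression for the alphabet {0, 1}.
PATTERN = re.compile(r"001+0|01+00|110+1|10+11")

mfw = minimal_forbidden_words("01", 8)
print(mfw[:4])
print(all(PATTERN.fullmatch(w) for w in mfw))
```

The search is exponential in the word length and is only intended for small alphabets and short words.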